Add binned summary statistic aggregation for genomic intervals — Closes #61 (#62)

Open
conradbzura wants to merge 45 commits into main from
61-binned-coverage-operator

Conversation

@conradbzura
Collaborator

@conradbzura conradbzura commented Mar 10, 2026

Summary

Add the COVERAGE operator for computing binned genome coverage from interval data. The operator tiles the genome into fixed-width bins using generate_series, LEFT-JOINs against the source table, and aggregates overlapping intervals per bin. It supports configurable statistics (count, mean, sum, min, max), an optional target column for aggregating a named field instead of interval length, and both the := and => named-parameter syntaxes. The generated SQL runs against both DuckDB and PostgreSQL.

In addition to the operator itself, this branch lands a unit-test suite mirroring src/giql/ under tests/unit/, a bedtools-backed integration-correctness suite under tests/integration/bedtools/test_correctness_*.py, and the shared test-infrastructure helpers that support them. Every test follows the project's BDD naming, Given-When-Then docstrings, and Arrange-Act-Assert phase comments.

Closes #61

Proposed changes

Expression node (GIQLCoverage)

Register GIQLCoverage as a SQLGlot Func. The from_arg_list classmethod delegates to the shared _split_named_and_positional helper alongside the other Func subclasses, separating named parameters via PropertyEQ (:=) and Kwarg (=>) from positional arguments. Parameters: this (interval column), resolution (bin width, required), stat (aggregation function), and target (column to aggregate).
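The named/positional split performed by the shared helper can be sketched in pure Python. The node classes below are illustrative stand-ins shaped like SQLGlot's PropertyEQ and Kwarg nodes, not the library's API, and the helper name mirrors but is not the project's `_split_named_and_positional`:

```python
from dataclasses import dataclass


@dataclass
class PropertyEQ:
    """Stand-in for a `name := value` argument node."""
    name: str
    value: object


@dataclass
class Kwarg:
    """Stand-in for a `name => value` argument node."""
    name: str
    value: object


def split_named_and_positional(args):
    """Separate named parameters (:= / =>) from positional arguments."""
    positional, named = [], {}
    for arg in args:
        if isinstance(arg, (PropertyEQ, Kwarg)):
            named[arg.name] = arg.value
        else:
            positional.append(arg)
    return positional, named


# COVERAGE(interval, resolution => 1000, stat := 'mean') splits into
# one positional argument and two named parameters:
pos, named = split_named_and_positional(
    ["interval", Kwarg("resolution", 1000), PropertyEQ("stat", "mean")]
)
assert pos == ["interval"]
assert named == {"resolution": 1000, "stat": "mean"}
```

Both spellings land in the same named-parameter dict, which is why := and => behave identically downstream.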

Parser registration

Register COVERAGE as a function in GIQLDialect so parse_one(..., dialect=GIQLDialect) produces a GIQLCoverage AST node. Exclude = from named-parameter syntax to avoid ambiguity with SQL equality.

Transformer (CoverageTransformer)

Rewrite queries containing COVERAGE(...) into a CTE-based plan:

  1. A __giql_chroms subquery computes per-chromosome MAX(end) from the source table. When the user-query FROM clause carries an alias, the subquery carries the same alias so alias-qualified WHERE predicates resolve.
  2. A __giql_bins CTE uses CROSS JOIN LATERAL generate_series(0, max_end - 1, resolution) to tile each chromosome — the - 1 clamp avoids a spurious trailing empty bin when MAX(end) lands on a bin boundary.
  3. A LEFT JOIN matches source intervals to bins, with any original WHERE conditions moved into the ON clause to preserve zero-coverage bins.
  4. GROUP BY and the selected aggregate — a typed SQLGlot node (exp.Count, exp.Avg, exp.Sum, exp.Min, exp.Max) — produce one row per bin.
  5. When no explicit AS alias is provided, the aggregate defaults to value.
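The endpoint clamp in step 2 can be illustrated with plain Python, since generate_series is endpoint-inclusive on both backends. The max_end and resolution values below are illustrative:

```python
def bin_starts(max_end: int, resolution: int) -> list[int]:
    """Bin start positions, mirroring generate_series(0, max_end - 1, resolution).

    range() excludes its stop, so range(0, max_end, resolution) equals the
    inclusive series with the - 1 clamp applied.
    """
    return list(range(0, max_end, resolution))


def unclamped_bin_starts(max_end: int, resolution: int) -> list[int]:
    """What an unclamped generate_series(0, max_end, resolution) would emit."""
    return list(range(0, max_end + 1, resolution))


# An interval ending exactly at position 1000 with resolution=1000:
assert bin_starts(1000, 1000) == [0]                   # only the occupied bin
assert unclamped_bin_starts(1000, 1000) == [0, 1000]   # spurious trailing bin
```

The spurious trailing value only appears when MAX(end) lands exactly on a bin boundary, which is why the clamp matters despite being a one-character change.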

User CTEs declared before the COVERAGE query are preserved alongside the internal __giql_bins CTE, so the operator composes with pre-filtering via the CTE-wrap pattern. Non-table FROM clauses (inline subqueries, VALUES) raise a typed ValueError at transpile time with a message pointing users at the CTE workaround. stat, target, and resolution must be string or integer literals — non-literal arguments raise typed errors rather than silently stringifying AST nodes. Negative or zero resolutions are rejected at transpile time. The default COUNT aggregate counts a non-null source column (not source.*), which gives value=0 for zero-coverage bins on both DuckDB and PostgreSQL.

CoverageTransformer reuses ClusterTransformer._get_table_name/_get_genomic_columns via delegation rather than duplicating them.

Documentation

Add COVERAGE to the aggregation-operators reference (docs/dialect/aggregation-operators.rst) with syntax, parameter descriptions (including the required flag on resolution), a sample output table, a LATERAL + generate_series compatibility note (DuckDB/PostgreSQL only — SQLite unsupported), and a "Supported FROM clauses" section explaining the CTE workaround for inline subqueries. List COVERAGE in the dialect operators index (docs/dialect/index.rst). Add a recipes page (docs/recipes/coverage.rst) with biological-context framing and patterns for basic coverage, named-resolution syntax, coverage statistics, filtered coverage, strand-specific coverage, target-column aggregation, 5' end counting (with correctly-quoted reserved-keyword identifiers), and RPM normalisation.

Unit tests

Add tests/unit/ mirroring src/giql/ with module-level test files for dialect, expressions, generators/base, table, transformer, and transpile. TestGIQLCoverage and TestCoverageTransformer cover parsing and transpilation end-to-end, including property-based tests with explicit @settings(max_examples=50) control, DuckDB execution tests that assert per-bin values across the full tiling, and regression tests for every correctness concern: LEFT-JOIN zero-coverage semantics, alias-qualified WHERE, user-CTE preservation, endpoint-clamped generate_series, non-positive-resolution rejection, and non-literal-stat/target rejection.

Integration tests

Add four correctness files under tests/integration/bedtools/: test_correctness_intersect.py, test_correctness_merge.py, test_correctness_nearest.py, and test_correctness_workflows.py. They exercise real bedtools subprocesses via pybedtools and compare against GIQL-transpiled SQL executed on real DuckDB. Tests are tagged with an integration pytest marker (registered in pyproject.toml) declared at module level in every integration file. A new generate_random_intervals helper produces deterministic multi-chromosome samples for scale tests.

Test infrastructure

Move the bedtools helper tests (test_bedtools_wrapper.py, test_comparison.py, test_duckdb_loader.py, test_data_models.py) to live alongside the helpers in tests/integration/bedtools/utils/, where they mirror their source modules. load_intervals accepts empty interval lists and creates an empty table rather than propagating duckdb.InvalidInputException.

Test cases

| # | Test module | Covers |
|---|-------------|--------|
| 1 | tests/unit/test_expressions.py::TestGIQLCoverage | COVERAGE parsing: positional args, named args via := and => syntax, all three parameter names (stat, target, resolution) reachable by either named form, plus property-based tests across 1-10M resolutions and every valid stat value |
| 2 | tests/unit/test_transformer.py::TestCoverageTransformer | COVERAGE transpilation: CTE structure, stat→SQL aggregate mapping, target-column aggregation, default value alias, explicit alias preservation, WHERE→ON migration, column qualification, alias propagation into chroms subquery, custom column mappings, table aliases, CTE recursion, multi-COVERAGE rejection, invalid-stat rejection, non-positive-resolution rejection, subquery-FROM rejection, non-literal-stat/target rejection, user-CTE preservation, endpoint-clamp off-by-one, and end-to-end DuckDB execution across all five stats plus zero-coverage and WHERE-preserves-zero-bins scenarios |
| 3 | tests/unit/test_dialect.py | Dialect-level parsing: GIQLDialect produces GIQLCoverage nodes from COVERAGE calls; other operators (CLUSTER, MERGE, DISTANCE, NEAREST) still parse correctly |
| 4 | tests/unit/test_generators_base.py | BaseGIQLGenerator produces correct SQL for all operators, including alias resolution and custom column mappings |
| 5 | tests/unit/test_table.py, tests/unit/test_transpile.py | Table/Tables registry behavior and end-to-end transpile() entry point |
| 6 | tests/integration/bedtools/test_correctness_intersect.py | INTERSECT parity with bedtools intersect across overlap configurations, containment, deduplication, non-standard chroms, large intervals, scale, and same/opposite-strand filters |
| 7 | tests/integration/bedtools/test_correctness_merge.py | MERGE parity with bedtools merge across topology configurations, overlap boundaries, unsorted input, per-chromosome segregation, strand-specific merge, and large-scale datasets |
| 8 | tests/integration/bedtools/test_correctness_nearest.py | NEAREST parity with bedtools closest across overlap, upstream, downstream, multi-query, k>1, strand-aware, cross-chromosome, and scale scenarios — with distance assertions pinned to the bedtools 2.31 half-open gap convention |
| 9 | tests/integration/bedtools/test_correctness_workflows.py | Chained pipelines: intersect→merge, nearest-then-filter-distance, merge→intersect, stranded intersect→merge, intersect-filter-merge, and the full intersect→merge→nearest pipeline with each GIQL stage feeding the next |
| 10 | tests/integration/bedtools/utils/test_* | Test infrastructure: bedtools wrapper, result comparison, DuckDB loader (including the empty-input success case), and GenomicInterval/ComparisonResult data models |

Total: 533 tests (357 unit + 176 integration). Every test follows BDD naming (test_<method>_should_<outcome>[_when_<condition>]), Given-When-Then docstrings on test functions and methods, and Arrange-Act-Assert phase comments.
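A minimal sketch of those three conventions together (the function under test here is hypothetical, invented purely for illustration):

```python
def add_interval_lengths(intervals: list[tuple[int, int]]) -> int:
    """Sum half-open interval lengths (hypothetical helper for illustration)."""
    return sum(end - start for start, end in intervals)


def test_add_interval_lengths_should_return_zero_when_list_is_empty():
    """Test that an empty interval list sums to zero.

    Given: an empty list of intervals
    When: add_interval_lengths is called
    Then: the result is 0
    """
    # Arrange
    intervals: list[tuple[int, int]] = []
    # Act
    result = add_interval_lengths(intervals)
    # Assert
    assert result == 0
```

The name carries the method, outcome, and condition; the docstring restates the scenario in Given/When/Then form; the phase comments mark the Arrange-Act-Assert body.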

@conradbzura
Collaborator Author

test_coverage.py has a number of tests that can be consolidated into a few property-based tests. We also need functional tests confirming query behavior — can use DuckDB for this.

@conradbzura conradbzura self-assigned this Mar 12, 2026
- ``'min'`` — minimum interval length of overlapping intervals
- ``'max'`` — maximum interval length of overlapping intervals

When ``target`` is specified, the stat is applied to that column instead of interval length.
Collaborator Author

@nvictus is interval length a sane default for target metric?

Define a new GIQLCoverage(exp.Func) AST node with this, resolution,
and stat arg_types. The from_arg_list classmethod handles both
positional and named parameters (EQ and PropertyEQ for := syntax).
Register COVERAGE in GIQLDialect.Parser.FUNCTIONS so the parser
recognises it.
CoverageTransformer rewrites SELECT COVERAGE(interval, N) queries
into a CTE-based plan: a __giql_bins CTE built from generate_series
via LATERAL, LEFT JOINed to the source table on overlap, with
GROUP BY and the appropriate aggregate (COUNT, AVG, SUM, MIN, MAX).
Wire the transformer into the transpile() pipeline before MERGE and
CLUSTER.
TestCoverageParsing (3 tests) verifies positional args, named stat
via :=, and named resolution. TestCoverageTranspile (11 tests) covers
basic transpilation, stat variants (mean/sum/max), custom column
mappings, WHERE preservation, additional SELECT columns, table alias
handling, resolution in generate_series, overlap join conditions, and
ORDER BY output.
Add a COVERAGE section to aggregation-operators.rst with description,
syntax, parameters, return value, examples, and related operators.
Create docs/recipes/coverage.rst with strand-specific coverage,
coverage statistics, filtered coverage, 5-prime end counting, and
RPM normalisation recipes. Add coverage to the recipe index.
Add exp.Kwarg handling alongside exp.PropertyEQ in from_arg_list so
that COVERAGE(interval, 1000, stat => 'mean') works identically to
the := form. Update the reference docs to show both syntaxes and add
a parsing test for the => form.
The = operator inside a function call is an equality comparison in
standard SQL, not parameter assignment. Only := (PropertyEQ) and
=> (Kwarg) are valid named parameter syntaxes. This makes COVERAGE
consistent with SQL semantics and allows = to be used as a boolean
expression argument.
Remove unused generate_series_sql variable and unwrap the redundant
exp.Subquery wrapper inside exp.Lateral. The old form emitted
CROSS JOIN LATERAL (GENERATE_SERIES(...)) which DuckDB rejects due
to the extra parentheses. The new form emits
CROSS JOIN LATERAL GENERATE_SERIES(...) which works on both DuckDB
and PostgreSQL.
Add optional target parameter to GIQLCoverage that specifies which
column to aggregate instead of defaulting to interval length
(end - start). When target is set, COUNT uses COUNT(target_col)
instead of COUNT(*), and other stats (mean, sum, min, max) aggregate
the named column.

Bare COVERAGE expressions without an explicit AS alias now default
to AS value.
The original query's WHERE was applied to the outer query, which
filtered out zero-coverage bins because source columns are NULL
for non-matching LEFT JOIN rows (NULL > threshold evaluates to
FALSE). Moving the WHERE into the JOIN's ON clause preserves all
bins while still filtering which source rows participate.

Also qualify unqualified column references with the source table
in both the JOIN ON condition and the chroms subquery WHERE to
avoid ambiguous column errors.
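The WHERE-to-ON distinction is standard LEFT JOIN semantics, so it can be demonstrated with the stdlib sqlite3 module even though the operator itself targets DuckDB/PostgreSQL. Table and column names below are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bins(bin_start INTEGER);
    CREATE TABLE features(bin_start INTEGER, score INTEGER);
    INSERT INTO bins VALUES (0), (1000);
    INSERT INTO features VALUES (0, 50);   -- only bin 0 has a feature
""")

# Filter in WHERE: the zero-coverage bin is dropped, because its joined
# columns are NULL and NULL > 10 does not evaluate to TRUE.
where_rows = conn.execute("""
    SELECT b.bin_start, COUNT(f.bin_start)
    FROM bins b LEFT JOIN features f ON f.bin_start = b.bin_start
    WHERE f.score > 10
    GROUP BY b.bin_start
    ORDER BY b.bin_start
""").fetchall()

# Filter in ON: every bin survives; unmatched bins count 0.
on_rows = conn.execute("""
    SELECT b.bin_start, COUNT(f.bin_start)
    FROM bins b LEFT JOIN features f
      ON f.bin_start = b.bin_start AND f.score > 10
    GROUP BY b.bin_start
    ORDER BY b.bin_start
""").fetchall()

assert where_rows == [(0, 1)]
assert on_rows == [(0, 1), (1000, 0)]
```

Moving the predicate into ON filters which source rows participate in the join without discarding the bins themselves, which is exactly what binned coverage needs.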
Replace the ad-hoc test classes with two spec-aligned classes:

- TestGIQLCoverage (10 tests): example-based parsing for positional
  args, :=/=> named params, target parameter, and all-named-params;
  property-based tests for stat+resolution combos, positional-only,
  and target syntax variants.

- TestCoverageTransformer (26 tests): instantiation, basic
  transpilation, all five stats, target with count/non-count,
  default and explicit aliases, WHERE-to-ON migration with column
  qualification, custom column mapping, table alias, resolution
  propagation, CTE nesting, error paths (invalid stat, multiple
  COVERAGE), and five DuckDB end-to-end functional tests.

Update docs to document the target parameter, default value alias,
and add a recipe for aggregating a specific column.
Cover bedtools_wrapper, comparison, data_models, and duckdb_loader
utility modules used by the integration test suite.
Cover dialect parser, expression nodes, BaseGIQLGenerator, table
metadata, ClusterTransformer, MergeTransformer, CoverageTransformer,
and the public transpile() API.
Compare GIQL INTERSECTS, MERGE, and NEAREST output against bedtools
results across edge cases, strand handling, scale, and multi-step
workflow pipelines.
@conradbzura conradbzura force-pushed the 61-binned-coverage-operator branch from 039baae to 1fba22a on April 9, 2026 15:30
COUNT(source.*) in the no-target branch counted LEFT-JOIN-unmatched
rows as 1 on DuckDB and 0 on PostgreSQL. DuckDB includes all-NULL
composite rows in the count, so every bin with no overlapping interval
returned value=1 instead of 0 — violating the portability and
canonicality principles and contradicting the PR's own zero-coverage
test case.

Counting the source chrom column instead collapses unmatched rows to
NULL on both backends, so empty bins correctly return 0.

Add a regression test that forces genuinely-empty middle bins (two
intervals 2kb apart with resolution=500), asserting value=0 on the
four intermediate bins. This test failed on DuckDB before the fix.
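The underlying distinction (COUNT(*) counts rows while COUNT(column) skips NULLs) is standard SQL and can be sketched with the stdlib sqlite3 module; DuckDB's COUNT(source.*) composite-row behaviour itself is backend-specific and not reproduced here. Names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bins(bin_start INTEGER);
    CREATE TABLE source(chrom TEXT, bin_start INTEGER);
    INSERT INTO bins VALUES (0), (500);
    INSERT INTO source VALUES ('chr1', 0);   -- bin 500 has no overlap
""")

rows = conn.execute("""
    SELECT b.bin_start,
           COUNT(*)       AS count_star,  -- counts the NULL-extended row too
           COUNT(s.chrom) AS count_col    -- NULLs collapse to 0
    FROM bins b LEFT JOIN source s ON s.bin_start = b.bin_start
    GROUP BY b.bin_start
    ORDER BY b.bin_start
""").fetchall()

# The unmatched bin gets count_star=1 but count_col=0:
assert rows == [(0, 1, 1), (500, 1, 0)]
```

Counting a non-null source column is therefore the portable way to make empty bins report 0.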
When a COVERAGE query's FROM clause uses a table alias and the WHERE
qualifies columns by that alias (FROM features f WHERE f.score > 10),
the alias-qualified predicate was copied verbatim into the
__giql_chroms subquery but the subquery's own FROM used only the
bare table name. DuckDB rejected the resulting SQL with a binder
error because the alias was unresolvable inside the subquery.

Forward the alias into the chroms subquery's FROM so alias-qualified
columns resolve. Add a regression test that runs COVERAGE over
FROM features f WHERE f.score > 10 against DuckDB and asserts the
filtered result contains the expected bins.
The final query built by CoverageTransformer.set("with_", ...)
unconditionally replaced query.args["with_"], dropping any CTE the
user declared before SELECT COVERAGE(...). WITH selected AS (...)
SELECT COVERAGE(interval, 1000) FROM selected produced a query that
referenced "selected" after the CTE had been thrown away, failing at
bind time with "Table does not exist".

Mirror the MergeTransformer fix from 185b716 by merging existing
user CTEs with the newly-built __giql_bins CTE. Add a regression
test that wraps COVERAGE over a user-defined CTE and asserts the
filtered bin count.
Negative literals parsed as Neg(Literal(N)) and the existing fallback
int(str(resolution_expr.this)) silently stripped the sign, so
COVERAGE(interval, -1) emitted generate_series(0, __max_end, 1) — a
valid but wildly different query than what the user wrote. A zero
resolution was also accepted and produced a query that failed at
execution with a backend-specific "step cannot be zero" error
instead of a clear GIQL error.

Replace the lossy str-conversion fallback with explicit Neg-of-Literal
handling that preserves the sign, then validate resolution > 0 and
raise a typed ValueError otherwise. Add regression tests for negative
and zero resolutions.
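The sign-preserving extraction can be sketched in pure Python with stand-in node classes shaped like SQLGlot's Neg and Literal (the class names mirror SQLGlot, but these stubs are illustrative, not the library's API):

```python
from dataclasses import dataclass


@dataclass
class Literal:
    """Stand-in for sqlglot exp.Literal; `this` holds the digits as a string."""
    this: str


@dataclass
class Neg:
    """Stand-in for sqlglot exp.Neg (unary minus wrapping a literal)."""
    this: Literal


def resolution_value(node) -> int:
    """Extract an integer resolution, preserving the sign of Neg nodes."""
    if isinstance(node, Neg) and isinstance(node.this, Literal):
        value = -int(node.this.this)   # keep the sign the lossy str() path lost
    elif isinstance(node, Literal):
        value = int(node.this)
    else:
        raise ValueError("COVERAGE resolution must be an integer literal")
    if value <= 0:
        raise ValueError(f"COVERAGE resolution must be positive, got {value}")
    return value


assert resolution_value(Literal("1000")) == 1000
# COVERAGE(interval, -1) now fails loudly instead of becoming resolution=1:
try:
    resolution_value(Neg(Literal("1")))
except ValueError as e:
    assert "positive" in str(e)
```

Validating at transpile time turns a backend-specific "step cannot be zero" runtime error into a clear GIQL error.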
GIQLCoverage.from_arg_list reinlined the named/positional split that
the four sibling Func subclasses already delegate to the module-level
helper. Replace the inline loop with a helper call so the parsing
behaviour stays in one place.
CoverageTransformer had byte-for-byte copies of _get_table_name and
a strand-less variant of _get_genomic_columns already on
ClusterTransformer. MergeTransformer handles this by holding a
ClusterTransformer and delegating; mirror that pattern here so
changes to the column-lookup logic only need to happen in one place.

_get_table_alias has no counterpart elsewhere and stays on
CoverageTransformer.
…rmer

Per the style guide's class-member ordering rule, public methods must
precede private helpers. _get_table_alias was sitting between __init__
and transform; move it below transform alongside the other private
methods so readers see the public entry point first.
generate_series is endpoint-inclusive on both DuckDB and PostgreSQL,
so calling it with (0, MAX(end), resolution) yields an extra series
element whenever MAX(end) lands exactly on a bin boundary. The LEFT
JOIN then emits a spurious bin beyond any interval in the data —
e.g., an interval ending at position 1000 with resolution=1000
produced bins [0,1000) and [1000,2000) instead of just the first.

Subtract one from MAX(end) inside generate_series so the series stops
at the last strictly-occupied base. Add a regression test that places
an interval ending exactly on a bin boundary and asserts only the
occupied bin is returned.
When the FROM clause is an inline subquery or VALUES rather than a
named table reference, _get_table_name returned None and the
transformer silently emitted SQL that (a) built the chroms subquery
with no FROM clause at all and (b) referenced a literal nonexistent
table named "source" in the LEFT JOIN. Users saw cryptic runtime
errors with no path back to their GIQL query.

Raise ValueError at transpile time with a clear message pointing at
the FROM clause. Keep CTE-by-name FROMs working — those parse as
exp.Table and yield the CTE name. Add a regression test using a
subquery FROM.
The stringify-fallback for non-Literal stat and target arguments
silently accepted AST nodes, stringified them, and tried to use the
result as a stat keyword or column name. For stat this produced a
confusing "Unknown COVERAGE stat '<stringified-node>'" message when
users wrote stat := some_col without quotes. For target it silently
succeeded because COUNT(source.score) happens to work when score is
a real column, masking user intent drift.

Replace both fallbacks with explicit Literal checks that raise a
typed ValueError. Add regression tests for both parameters.
The COVERAGE operator reference page did not explain what FROM shapes
are accepted, and the ValueError raised when an unsupported FROM is
encountered did not point users at the CTE-wrap workaround. Add a
"Supported FROM clauses" section with a before/after example and
extend the error message to tell users to wrap the derivation in a
WITH clause.

Also note that user-defined CTEs are preserved alongside the internal
__giql_bins CTE, so composing COVERAGE over a pre-filtering CTE is
the canonical way to work around the inline-subquery limitation.
The COVERAGE operator reference page existed but was not linked from
the dialect landing page's Aggregation Operators table, making the
new operator undiscoverable from the docs TOC. The MERGE page's
Related Operators section already cross-references COVERAGE, so the
index omission was also internally inconsistent.
"end" is a reserved keyword in ANSI SQL (used in CASE ... END) and
"start" is reserved in some dialects. DuckDB tolerates them unquoted
in many contexts but PostgreSQL in strict mode and other engines
will reject the unquoted identifiers. The rest of the docs and the
COVERAGE transpiler emit these as double-quoted identifiers, so the
recipe should match to stay portable across the GIQL-supported
backends.
The test was named _distance_zero with a docstring claiming "adjacent
intervals report distance=0", but the assertion accepted <= 1 and an
inline comment admitted bedtools 2.31+ reports 1. The name, docstring,
and assertion disagreed — the test body didn't actually verify the
behavior the name claimed, breaking the test guide's core principle.

Rename to describe the real behavior (finding the correct adjacent
neighbor — a parity test, not a distance-value claim), rewrite the
docstring in the required Given/When/Then format, and pin the
bedtools distance assertion to the canonical value of 1 for the
pinned bedtools >= 2.31.0 dependency. Add AAA phase comments while
rewriting the body.
The previous test promised "full pipeline (intersect -> merge ->
nearest) is run then each intermediate step matches bedtools" but
never ran the nearest step (intervals_c was loaded then unused) and
bypassed GIQL mid-pipeline by reconstructing step 1 with a
hand-written raw-SQL overlap predicate instead of feeding the
verified GIQL step-1 output into step 2.

Rewrite so that each stage's GIQL output is materialized as a table
and fed into the next GIQL stage — step 1 → step1_results table →
step 2 → step2_results table → step 3. Switch the inputs from
random generation to hand-crafted deterministic intervals so the
nearest step's correctness is unambiguous. Assert row equality for
intersect and merge; assert (a_name, b_name) pair equality for
nearest because bedtools 2.31+ uses the N+1 half-open gap distance
convention while GIQL uses N (distance-value parity is already
covered by the dedicated nearest tests).

Apply the BDD naming pattern, GWT docstring, and AAA phase comments
while rewriting the test body.
pytest only reads module-level pytestmark from test modules, not from
conftest.py. The conftest declared pytestmark = pytest.mark.integration
but none of the test files did, so -m integration and -m "not
integration" selection was a no-op and strict-markers mode would
error on the unregistered marker.

Register the integration marker in pyproject.toml and add the module
pytestmark to every test_*.py file under tests/integration/bedtools/.
Remove the dead pytestmark line from conftest.py to avoid implying
it propagates. With this change pytest -m integration now selects
all 103 integration tests and -m "not integration" correctly
deselects them.
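The resulting configuration looks roughly like this; the marker name matches the one the PR registers, and the pyproject.toml fragment is shown as a comment:

```python
# pyproject.toml:
# [tool.pytest.ini_options]
# markers = [
#     "integration: tests that exercise real bedtools subprocesses and DuckDB",
# ]

import pytest

# Module-level pytestmark in each tests/integration/bedtools/test_*.py file.
# pytest collects `pytestmark` from test modules, not from conftest.py, so
# this line (not a conftest declaration) is what makes -m selection work.
pytestmark = pytest.mark.integration
```

With the marker registered, `pytest -m integration` selects these modules, `-m "not integration"` deselects them, and `--strict-markers` no longer errors on an unknown marker.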
…on correctness tests

The four tests/integration/bedtools/test_correctness_*.py files used
ALL-CAPS GIVEN/WHEN/THEN docstrings without a leading summary, no
Arrange/Act/Assert phase comments, and test names that described
scenarios without a should_<outcome> clause. All three violate the
Python Test Guide's core principles (naming §3, GWT docstrings §4,
AAA phase comments §5), and the project has resolved the naming
conflict with the older constitution example in favour of the Test
Guide's BDD pattern.

Rename every test to test_<method>_should_<outcome>[_when_<condition>]
form, rewrite every docstring with a leading "Test <summary>." line,
blank line, and indented Given/When/Then blocks, and add
Arrange/Act/Assert phase comments to every test body. Also fix the
stale "grep chr1" reference in the intersect-filter-merge workflow
docstring to describe the actual Python-side filter.

Helper functions (_run_*_comparison, _load_and_query_nearest) keep
their plain docstrings since the guide restricts GWT to test
functions and methods. All 35 tests still pass.
Three files under tests/unit/ tested helpers that live in
tests/integration/bedtools/utils/. Per the test guide, tests must
mirror the directory structure of the modules under test, and tests
for test infrastructure should sit alongside the infrastructure —
not in the project-level unit test package. The same three files
also wrapped module-level function tests in Test<Function> classes,
violating the guide's rule that module-level functions get
module-level tests.

Move the files:
  tests/unit/test_bedtools_wrapper.py -> tests/integration/bedtools/utils/
  tests/unit/test_comparison.py       -> tests/integration/bedtools/utils/
  tests/unit/test_duckdb_loader.py    -> tests/integration/bedtools/utils/

Convert TestCreateBedtool, TestIntersect, TestMerge, TestClosest,
TestBedtoolToTuples, TestCompareResults, and TestLoadIntervals from
class wrappers to module-level functions. TestBedtoolsError stays a
class because it wraps a real exception class. Update imports to
relative form and add pytestmark = pytest.mark.integration to each
moved module (these exercise real bedtools subprocess + DuckDB).
… duplicate

tests/test_coverage.py duplicated TestGIQLCoverage (also present in
tests/unit/test_expressions.py) and TestCoverageTransformer (also
present in tests/unit/test_transformer.py) with stronger content in
the root file — more tests, proper AAA phase comments, proper
Given-When-Then docstrings, property tests, and end-to-end DuckDB
coverage. The root file also violated the test guide's mirror-the-
source-module convention because there is no src/giql/coverage.py
module.

Replace the weaker TestGIQLCoverage in tests/unit/test_expressions.py
with the 10-test version from the root file (up from 7). Replace the
weaker TestCoverageTransformer in tests/unit/test_transformer.py
with the 37-test version (up from 20). Carry over the Hypothesis,
duckdb, Table, and transpile imports needed by the merged content,
plus the VALID_STATS module constant. The to_df fixture already
lives in tests/conftest.py so it is auto-discovered by tests/unit/
without a new conftest.

Delete tests/test_coverage.py.

Net: 47 coverage tests added to unit/, 27 weaker duplicates removed, and
the 1222-line root file deleted, leaving 89 tests in the two touched unit
files after the merge (up from ~57).
tests/unit/test_data_models.py tests classes from
tests/integration/bedtools/utils/data_models.py. Same mirror-the-
source-module concern that motivated the prior helper-test
relocations — this one was missed because the original review
finding did not mention it explicitly.

Move to tests/integration/bedtools/utils/test_data_models.py,
switch to relative imports, add the integration marker. All 24
tests pass from the new location.
…tests

The test files under tests/unit/ (and the helper-tests moved into
tests/integration/bedtools/utils/) used a mix of spec-ID-prefixed
names (BG-001, CSM-001, CT-007, DC-001, GR-001, GD-001, MT-001,
SP-001, SSP-001, DI-001, NR-001, CL-001), scenario-only names
(test_contains, test_iter), and ALL-CAPS GIVEN/WHEN/THEN docstrings
without a leading summary. None of them had Arrange/Act/Assert
phase comments. All three violate Python Test Guide §3/§4/§5.

Sweep all ten files:
- Rename every test to test_<method>_should_<outcome>[_when_<condition>]
  form where <method> matches the __name__ of the method or function
  under test exactly; dunder methods use the dunder form
  (test___init___should_..., test___contains___should_..., etc.).
  Strip spec-ID prefixes from test names.
- Replace every docstring with the "Test <summary>." + blank line +
  indented Given:/When:/Then: block form required by §4.
- Add # Arrange / # Act / # Assert phase comments (or combined forms
  where phases are inseparable) to every test body.

Classes with already-compliant docstrings and AAA comments from the
earlier tests/test_coverage.py merge (TestGIQLCoverage in
test_expressions.py and TestCoverageTransformer in test_transformer.py)
had only their names updated per B4.10/B4.11/B4.12 — the strong
body style was preserved verbatim.

Helper functions (_make_tables, _transform_and_sql, _normalize, etc.)
and fixtures keep their plain docstrings per the guide's rule that
GWT applies only to test functions and methods. Imports, fixtures,
assertions, test data, and control flow are unchanged across all
touched files. All 533 tests pass.
Test Guide §7 requires property tests to use @settings to control
max_examples. Two property tests (in tests/integration/bedtools/utils/
test_comparison.py and test_data_models.py) had @given without any
@settings, and five @settings calls in tests/unit/test_expressions.py
and tests/unit/test_transformer.py only set suppress_health_check
without pinning max_examples — every one of them rode on Hypothesis
defaults.

Add @settings(max_examples=50) to the two bare property tests
(importing settings where needed) and add max_examples=50 to the
five existing @settings calls so all seven property tests run a
deliberately-chosen sample size.
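A sketch of the pattern; the strategy bounds are illustrative, and the test name is hypothetical:

```python
from hypothesis import given, settings, strategies as st


@settings(max_examples=50)  # deliberate sample size instead of Hypothesis defaults
@given(resolution=st.integers(min_value=1, max_value=10_000_000))
def test_resolution_should_be_positive_when_drawn_from_valid_range(resolution):
    """Test that generated resolutions stay in the valid positive range.

    Given: an integer drawn from 1 to 10,000,000
    When: the property test runs
    Then: the value is strictly positive
    """
    # Arrange / Act / Assert (phases inseparable for a pure property check)
    assert resolution > 0


# Calling a @given-wrapped function with no arguments runs all examples:
test_resolution_should_be_positive_when_drawn_from_valid_range()
```

Pinning max_examples makes the sample size an explicit, reviewable choice rather than whatever the installed Hypothesis version defaults to.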
Tests calling the helper assert on SQL substrings, but the original
name "_transform_and_sql" obscured that the helper exercises three
stages — parse, transform, and generate — so any failure in the
generator surfaced as a transformer-test failure with no signal
about which stage was at fault.

Rename to _transpile_with_transformer and add a docstring that
states the helper runs the full pipeline and that test failures
should be triaged across all three stages, not assumed to be
transformer bugs. Update all 14 call sites.
Bundle of five minor cleanups flagged in the review:

- dialect.py: alphabetize the giql.expressions imports so GIQLCoverage
  slots between GIQLCluster and GIQLDistance (ruff/isort order).
- transformer.py: annotate COVERAGE_STAT_MAP as Final[dict[str, str]]
  to signal binding immutability.
- transformer.py: switch the CoverageTransformer class-docstring
  summary to the imperative mood required by the style guide.
- transformer.py: replace :return: with :returns: across the module
  (12 occurrences) to match the constitution's mandated reST tag.
- transformer.py: drop the redundant "AS <table>" alias emitted on
  the LEFT JOIN when no user-supplied alias is present — produces
  LEFT JOIN features rather than LEFT JOIN features AS features.
  Also removes the dead "source" literal fallback since an earlier
  fix raises on non-table FROMs before reaching this code.
The transformer was building COUNT/AVG/SUM/MIN/MAX as
exp.Anonymous(this=<name>, expressions=[...]) nodes. That form
bypasses SQLGlot's dialect-specific rendering — DuckDB and
PostgreSQL happen to render these five aggregates identically, but
any future backend (or dialect-specific NULL/DISTINCT handling) has
to re-learn each aggregate from the string name. Portable-principle
smell.

Swap to exp.Count/Avg/Sum/Min/Max typed nodes, looked up via a
private _AGG_NODE mapping keyed by the SQL name from
COVERAGE_STAT_MAP. Also collapse the four-branch if/else that
built the aggregate argument into a single-assignment block so
the aggregate type and its inner expression are decided in one
place.

COVERAGE_STAT_MAP remains public (it's imported by tests and acts
as the stat-name contract); the typed-node lookup is private.
Six minor clarifications flagged in the review:

- aggregation-operators.rst: replace the ambiguous
  "COVERAGE(interval, resolution)" example — where "resolution" read
  as either a placeholder or a named-parameter keyword — with the
  clearer "<bin_width>" placeholder.
- aggregation-operators.rst: mark resolution (required) and note
  that a non-positive value raises ValueError at transpile time.
- aggregation-operators.rst: add a compatibility note explaining
  COVERAGE relies on LATERAL + generate_series (DuckDB and
  PostgreSQL) and is not currently supported on SQLite.
- coverage.rst: add a biological-context paragraph at the top of
  the "Basic Coverage" section so the recipes open with a motivation
  matching the framing used in the sibling MERGE/CLUSTER recipes.
- coverage.rst: add a "Named Resolution Parameter" recipe
  demonstrating "resolution := 500" for symmetry with the other
  named-parameter examples — the reference page shows it but the
  recipes never did.
- coverage.rst: trim the one-character overshoot on the
  "Coverage of High-Scoring Features" section underline.
…elper

Bundle of seven related review findings on the integration correctness
suite:

- Extract the duplicated random-interval generation loop from the
  three scale tests into a new helper
  tests/integration/bedtools/utils/random_intervals.py so future
  scale tests share one deterministic source (NB3.5).

- Replace disjunctive "in (300, 301)" distance tolerances in the
  nearest upstream/downstream tests with exact value 301, matching
  the pinned bedtools >= 2.31 half-open gap convention (NB3.1).

- Replace the sort-then-zip pair comparison in the multi-query
  nearest test with set-equality on (a_name, b_name) tuples, which
  is what the docstring actually claims (NB3.2).

- Introduce module-level column-index constants (A_NAME, B_CHROM,
  B_NAME, DISTANCE) in test_correctness_nearest.py so assertions
  read by name instead of magic positional indices like r[9] (NB3.3).

- Strengthen test_merge_should_preserve_strand_when_stranded_true to
  route through _run_merge_comparison(..., strand_mode="same") and
  assert coordinate-level equality instead of just row count (NB3.4).

- Strengthen the nearest-then-filter-distance workflow to assert
  set equality on (a_name, b_name) pairs between GIQL and the
  distance-filtered bedtools output, not just row counts (NB3.6).

- Drop the brittle substring-match strand_mode inference in
  _run_intersect_comparison; accept strand_mode explicitly and
  forward it to the bedtools wrapper (NB3.7).

All 533 tests still pass.
Four related review findings on unit-test quality:

- Pin the bedtools distance value in the closest-basic test to the
  exact 2.31+ output (101 for a 100-base half-open gap) instead of
  "in (100, 101)". The k>1 test similarly pinned to the exact count
  bedtools 2.31.1 returns for tied candidates under -t first (NB4.1).

- Make load_intervals accept an empty interval list. The helper
  previously called conn.executemany on a zero-row list and let
  DuckDB raise InvalidInputException, which the unit test merely
  documented. Create the table and skip the insert when empty;
  rewrite the test to assert the empty-table success outcome (NB4.2).

- Strengthen two end-to-end COVERAGE tests to assert the full
  bin-tiling (len, start set, and per-bin values) rather than
  single-bin spot checks. Fix the zero-coverage-gaps fixture so
  there is actually a zero-coverage middle bin to observe (NB4.3).

- Derive VALID_STATS in tests/unit/test_expressions.py and
  tests/unit/test_transformer.py from COVERAGE_STAT_MAP so the
  property-test sample domain stays in sync with the implementation
  — adding a new stat key in the source automatically extends the
  property tests (NB4.4).